•  Field  and  remote  observations

•  Models: 

-  Dynamical 

-  Measurement 

-  Error 

•  Assimilation  schemes 

•  Sampling  strategies 

•  State  and  parameter  estimates 

•  Uncertainty  estimates 

•  A  Dynamic  Data-Driven 
Application  System  (DDDAS) 


Fig. 1. Data assimilation system schematic


LOOPS/Poseidon 

Adaptive  Interdisciplinary  Ocean  Forecasting 
in  a  Distributed  Computing  Environment 


Research  coupling  Physical  and  Biological  Oceanography  with  Ocean  Acoustics. 

More  effective  Real-Time  Ocean  Forecasting  for  Naval  and  Maritime  Operations, 
Pollution  Control,  Fisheries  Management,  Scientific  Data  Acquisition,  etc. 

MIT  OE  (IT,  Acoustics)  and  Harvard  DBAS  (Ocean  Physics-Biology-Acoustics). 


Key  points 

•  Web  interface 

•  Remote  visualization 

•  Metadata  for  code  and  data 

•  Metadata/Ontology  editors 

•  Legacy  application  support 

•  Grid  computing  infrastructure 

•  Transparent  data  access

•  Data  assimilation  (ESSE,  OI)

•  Interdisciplinary  interactions 

•  Adaptive  modeling 

•  Adaptive  sampling 

•  Feature  Extraction 

•  Prototype  for  community-use 
Physical-Biological-Acoustical
Oceanography  with  HOPS


Harvard  Ocean  Prediction  System  -  HOPS 
Real  Time  and  Historical
Database 
Primitive  Equation  (PE)
physical  dynamics  model 

Multiple  biological  models 

Interfaces  to  acoustical  models 

Adaptable  to  different  domains 

Nested-domains  parallelism 

Software:  F77-matlab-C 

I/O:  NetCDF,  stdin 


Applications 


Error  Subspace  Statistical  Estimation  (ESSE) 
•  Uncertainty  forecasts  (with  dynamic  error  subspace,  error  learning)

•  Ensemble-based  (with  nonlinear  and  stochastic  model) 

•  Multivariate,  non-homogeneous  and  non-isotropic  DA 

•  Consistent  DA  and  adaptive  sampling  schemes 

•  Software:  not  tied  to  any  model,  but  specifics  currently  tailored  to  HOPS 
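As a rough illustration of the ensemble-based error subspace idea, the sketch below estimates dominant error modes from an ensemble of forecasts through a singular value decomposition of the ensemble anomalies. It is not the ESSE software: the function name `error_subspace`, the retained-variance criterion `var_fraction` and the toy ensemble are assumptions made for this example.

```python
import numpy as np

def error_subspace(ensemble, var_fraction=0.99):
    """Estimate a dominant error subspace from an ensemble.

    ensemble: array of shape (n_state, n_members).
    Returns (mean, modes, variances), where modes are the leading left
    singular vectors of the normalized anomaly matrix and variances are
    the matching sample error variances.
    """
    n_members = ensemble.shape[1]
    mean = ensemble.mean(axis=1, keepdims=True)
    anomalies = (ensemble - mean) / np.sqrt(n_members - 1)
    u, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    variances = s ** 2
    # Keep the fewest modes whose cumulative variance reaches var_fraction.
    cumulative = np.cumsum(variances) / variances.sum()
    rank = int(np.searchsorted(cumulative, var_fraction) + 1)
    return mean[:, 0], u[:, :rank], variances[:rank]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy ensemble (assumption): 50 state variables, 20 members,
    # with errors confined to two directions of the state space.
    basis = np.linalg.qr(rng.standard_normal((50, 2)))[0]
    coeffs = rng.standard_normal((2, 20)) * np.array([[3.0], [1.0]])
    members = 10.0 + basis @ coeffs
    mean, modes, variances = error_subspace(members)
    print("retained modes:", modes.shape[1])
    print("mode variances:", variances)
```

Truncating to the leading modes is what keeps the uncertainty description tractable for large state vectors; the retained fraction is a tuning choice, not a value taken from the slides.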


IT  Design  Motivations 

Real-time  predictions  of  interdisciplinary  ocean  fields  and  uncertainties 

-  Data  Assimilation  (DA)  using  ESSE  is  currently  ensemble-based  and  thus 
ideal  for  high  throughput  distributed  computing 

-  Interdisciplinary  interactions  and  multiscale/nested  simulations  ideal  for 
parallel  computing 
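Because ensemble members are independent, they can be dispatched concurrently. The following minimal sketch shows that throughput pattern with Python's standard library; `run_member` is a hypothetical stand-in (a toy damped random walk), whereas a real member would launch a model run with perturbed inputs on a cluster or Grid node.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

def run_member(seed, n_steps=100):
    """Stand-in for one stochastic model run; returns a scalar forecast.

    Hypothetical toy model: a damped random walk seeded per member so
    that each run is reproducible and independent of the others.
    """
    rng = random.Random(seed)
    x = 1.0
    for _ in range(n_steps):
        x = 0.95 * x + rng.gauss(0.0, 0.1)
    return x

def run_ensemble(n_members, max_workers=4):
    """Run independent members concurrently; results are ordered by seed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_member, range(n_members)))

if __name__ == "__main__":
    forecasts = run_ensemble(32)
    print("ensemble mean:", statistics.mean(forecasts))
    print("ensemble std: ", statistics.stdev(forecasts))
```

Threads are used here only to keep the example self-contained; when each member is an external executable, the worker would simply wait on that process or on a batch-queue job.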

Develop  autonomous  adaptive  models  for  physics  &  biology 

-  Adaptive  parameter  values,  model  structures  and  state  variables

-  Error  metrics  and  criteria  for  adaptation 

Towards  automated,  distributed  management  of  observed  and  modeled  data 

-  Consistent  use  of  metadata  helps  provide  transparent  data  management, 
including  quality  control 

-  Forecasting  workflow  is  being  automated,  including  DA 

Web  access  from  lightweight  clients  eases  operational  use  and  system 
control 

Interactive  visualizations  for  better  understanding  and  decision-making 


Software  Strategies 

Exploit  parallelism  (especially  throughput)  opportunities 
Maximize performance and ease of use, with limited code changes

- For new generalized adaptive biological model: MPI coding
- For existing software: automate file I/O based workflows
- Work to the maximum extent possible at the binary level
- Metadata for software use (and installation) in XML

Use Grid technologies

- For user: compute and data access solutions
- Drive forecasting, visualization workflows on the Grid
- Present results to user's web browser


GRID  COMPUTING  -  MIDDLEWARE 
ADAPTIVE  SAMPLING


Interdisciplinary  Data  Assimilation  (DA) 


Is  in  its  infancy,  but  can  contribute  significantly  to 
understanding  physical-acoustical-biogeochemical 
processes,  including  quantitative  development  of 
fundamental  models 

Required  for  interdisciplinary  ocean  field 
prediction  and  parameter  estimation 

Model-model,  data-data  and  data-model 
compatibilities  are  essential 

Care  must  be  exercised  in  understanding,  modeling 
and  controlling  errors  and  in  performing  sensitivity 
analyses  to  establish  robustness  of  results 

Dedicated  interdisciplinary  research  needed 


Coupled  Physical-Acoustical  Filtering  via  ESSE 


Coupled assimilation of sound-speed and transmission loss (TL) data
for a joint estimate of sound-speed and TL fields

Twin-experiments:

•  “Truth” ocean physics assimilates natural data

•  Provides 3 CTDs

•  Corresponding TL “truth” provides towed-receiver TL data,
every 500 m at 75 m depth
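The idea of a joint estimate can be sketched with a Kalman-type update on an augmented state that stacks sound speed and TL, so that a TL observation also corrects sound speed through the ensemble cross-covariance. This is a simplified illustration, not the ESSE scheme itself: it uses the full sample covariance rather than an error subspace, and `joint_update` and the toy numbers are assumptions.

```python
import numpy as np

def joint_update(ensemble, obs_index, obs_values, obs_var):
    """Update the mean and covariance of an augmented state.

    ensemble: (n_state, n_members), e.g. rows = [sound speed; TL].
    obs_index: indices of the state entries that are observed.
    obs_values: the observations; obs_var: their error variance.
    Returns (posterior_mean, posterior_covariance).
    """
    n_state, n_members = ensemble.shape
    mean = ensemble.mean(axis=1)
    anomalies = ensemble - mean[:, None]
    prior_cov = anomalies @ anomalies.T / (n_members - 1)
    # Observation operator that picks out the observed entries.
    h = np.zeros((len(obs_index), n_state))
    h[np.arange(len(obs_index)), obs_index] = 1.0
    innovation_cov = h @ prior_cov @ h.T + obs_var * np.eye(len(obs_index))
    gain = prior_cov @ h.T @ np.linalg.inv(innovation_cov)
    innovation = np.asarray(obs_values, dtype=float) - h @ mean
    post_mean = mean + gain @ innovation
    post_cov = prior_cov - gain @ h @ prior_cov
    return post_mean, post_cov

if __name__ == "__main__":
    # Toy ensemble (assumption): row 0 is sound speed in m/s, row 1 is TL in dB,
    # with TL perfectly correlated with sound speed across members.
    ensemble = np.array([[1499.0, 1500.0, 1501.0],
                         [58.0, 60.0, 62.0]])
    post_mean, post_cov = joint_update(ensemble, [1], [61.0], obs_var=4.0)
    print("posterior mean:", post_mean)
    print("posterior covariance:\n", post_cov)
```

In the toy case only TL is observed, yet the sound-speed estimate moves as well, which is the point of assimilating the two fields together.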


[Figure panels, each on a -6 to 6 m/s scale: C1 prior - C1 true; C1 post1 - C1 true; C1 post - C1 true]
Coupled  Physical-Biogeochemical  Smoothing  via  ESSE
[Figure panels: Nowcast, 25 Aug 1998; 8-day forecast, 3 Sep 1998. Plotted range: min = 7.6638E-04, max = 8.3537E+00]


Cross-sections  in  Chl-a 
fields,  from  south  to  north 
along  main  axis  of 
Massachusetts  Bay,  with: 

a)  Nowcast  on  Aug.  25 

b)  Forecast  for  Sep.  2 
c)  2D  objective  analysis
for  Sep.  2  of  Chl-a  data 
collected  on  Sep.  2-3 

d)  ESSE  filtering  estimate 
on  Sep.  2 


Coupled  Physical-Biogeochemical  DA  via  ESSE  (continued) 
e)  Difference  between
ESSE  smoothing 
estimate  on  Aug.  25  and 
nowcast  on  Aug.  25 

f)  Forecast  for  Sep.  2, 
starting  from  ESSE 
smoothing  estimate  on 
Aug.  25 

g)  As d), but for Chl-a
at 20 m depth

h)  RMS differences
between Chl-a data on
Sep. 2 and the field
estimates at these data
points as a function of
depth (specifically,
“RMS-error” for
persistence, dynamical
forecast and ESSE
filtering estimate)
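The depth-binned RMS comparison in panel h) can be computed along the following lines. The helper `rms_by_depth` and all numbers in the usage example are hypothetical and serve only to show the bookkeeping; they are not values from the experiment.

```python
import math
from collections import defaultdict

def rms_by_depth(observations, estimates):
    """Return {depth: RMS difference} between data and a field estimate.

    observations: list of (depth_m, observed_value) pairs.
    estimates: field estimate interpolated to the same data points,
    in the same order as observations.
    """
    squared = defaultdict(list)
    for (depth, observed), estimated in zip(observations, estimates):
        squared[depth].append((estimated - observed) ** 2)
    return {depth: math.sqrt(sum(vals) / len(vals))
            for depth, vals in sorted(squared.items())}

if __name__ == "__main__":
    # Made-up Chl-a data points (depth in m, value) and three estimates.
    data = [(5, 2.0), (5, 2.4), (20, 1.0), (20, 1.4)]
    candidates = {
        "persistence": [1.2, 1.5, 0.4, 0.6],
        "dynamical forecast": [1.7, 2.0, 0.8, 1.0],
        "filtering estimate": [1.9, 2.3, 1.0, 1.3],
    }
    for name, values in candidates.items():
        print(name, rms_by_depth(data, values))
```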


How  Gaussian  are  biogeochemical  error  forecast  distributions? 


(a)  Skewness  NO3  error
(b)  Skewness  Chl  error
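One simple way to quantify departure from Gaussianity is the sample skewness of the ensemble error at each grid point, which is zero for a symmetric distribution. The sketch below uses the moment-based estimator; the function name and the lognormal toy sample are assumptions for illustration.

```python
import random

def sample_skewness(values):
    """Moment-based sample skewness m3 / m2**1.5 (0 for symmetric samples)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0.0:
        return 0.0  # Degenerate ensemble: all members identical.
    return m3 / m2 ** 1.5

if __name__ == "__main__":
    rng = random.Random(1)
    # Toy ensemble of a positive concentration at one grid point (assumption):
    # lognormal samples are positive and right-skewed.
    members = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]
    print("skewness:", sample_skewness(members))
```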
Interdisciplinary  Adaptive  Sampling


Use  forecasts  and  their  uncertainties  to  alter  the 
observational  system  in  space  (locations/paths)  and  time 
(frequencies)  for  physics,  biology  and  acoustics. 

Locate  regions  of  interest,  based  on: 

•  Uncertainty  values  (error  variance,  higher  moments,  pdfs) 

•  Interesting  physical/biological/acoustical  phenomena  (feature
extraction,  Multi-Scale  Energy  and  Vorticity  analysis)

•  Maintain  synoptic  accuracy

Plan  observations  under  operational,  time  and  cost 
constraints  to  maximize  information  content  (e.g.  minimize 
uncertainty  at  final  time  or  over  the  observation  period). 
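A very simple version of this planning step is to score each candidate path by the forecast error variance it would sample and to keep the best path that fits the cost budget. The sketch below is a crude proxy under stated assumptions: the data layout, the function `choose_path` and the scoring rule are hypothetical, and a fuller scheme would predict the posterior uncertainty that each plan would yield.

```python
def choose_path(error_variance, candidate_paths, max_cost):
    """Pick the feasible path that covers the most forecast error variance.

    error_variance: {grid_point: forecast error variance}.
    candidate_paths: {name: (cost, [grid_points])}.
    Returns the name of the best path, or None if none fits the budget.
    """
    best_name, best_score = None, float("-inf")
    for name, (cost, points) in candidate_paths.items():
        if cost > max_cost:
            continue  # Violates the operational/time/cost constraint.
        # Count each grid point once, even if the path revisits it.
        score = sum(error_variance.get(p, 0.0) for p in set(points))
        if score > best_score:
            best_name, best_score = name, score
    return best_name

if __name__ == "__main__":
    variance = {(0, 0): 1.0, (0, 1): 4.0, (1, 1): 2.0}
    paths = {
        "north": (5.0, [(0, 0), (0, 1)]),
        "south": (3.0, [(1, 1)]),
        "long": (20.0, [(0, 0), (0, 1), (1, 1)]),
    }
    print(choose_path(variance, paths, max_cost=10.0))
```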


Integrated  Ocean  Observing 
and  Prediction  Systems 


HOPS/ESSE - AOSN-II Accomplishments


23  sets  of  real-time  nowcasts  and  forecasts  of  temperature,  salinity 
and  velocity  released  from  4  August  to  3  September 

10  sets  of  real-time  ESSE  forecasts  issued  over  same  period:  total 
of  4323  ensemble  members  (stochastic  model,  BCs  and  forcings) 

Adaptive  sampling  recommendations  suggested  on  a  routine  basis 

Web:  http://www.deas.harvard.edu/~leslie/AOSNII/index.html 
for  daily  distribution  of  forecasts,  scientific  analyses,  data  analyses, 
special  products  and  control-room  presentations 

Assimilated  ship  (Pt.  Sur,  Martin,  Pt.  Lobos),  glider  (WHOI  and 
Scripps)  and  aircraft  SST  data,  within  24  hours  of  appearance  on 
data  server  (after  quality  control) 

Forecasts  forced  by  3km  and  hourly  COAMPS  flux  predictions 


Real-time  Adaptive  Sampling  -  Pt.  Lobos 


•  Large uncertainty forecast on 26 Aug., related to the predicted
meander of the coastal current, which advected warm and fresh
waters towards the Monterey Bay Peninsula.

•  Position and strength of the meander were very uncertain
(e.g. T and S error standard deviations, based on 450 2-day forecasts).

•  Different ensemble members showed that the meander could be
very weak (almost not present) or further north than in the
central forecast.

•  Sampling plan designed to investigate the position and strength
of the meander and the region of high forecast uncertainty.


AOSN-II Pt. Lobos
ESSE  field  and  error  modes  forecast  for  August  28  (all  at  10m)
Real-time  Adaptive  Coupled  Models
•  Different types of adaptive couplings:

-  Adaptive physical model drives multiple biological models (biology hypothesis testing)
-  Adaptive physical model and adaptive biological model proceed in parallel, with some
independent adaptation

•  Implementation:

-  For performance and scientific reasons, both modes are being implemented using message
passing for parallel execution
-  Mixed language programming (using C function pointers and wrappers for functional
choices)


Generalized  Adaptable  Biological  Model 
A  Priori  Biological  Model


Example:  Use  P  data  to  select  parameterisations  of  Z  grazing 


Table 1. Parameterization of grazing on multiple types of prey with passive
selection ($g_{max}$: maximum grazing rate; $K$: half-saturation constant (but
saturation constant in Eq. 1); $P_0$: threshold below which grazing is zero; $p_i$:
preference coefficient; ?, a, ?: constants).

Function / References

(1) Rectilinear: piecewise expression with one branch for $R < K$ and one for $R > K$, where $R = \sum_{j=1}^{n} p_j P_j$ [branch expressions not legible in source]
    References: Armstrong, 1994

(2) Ivlev function for each prey type: [equation not legible in source]
    References: Leonard et al., 1999

(3) Ivlev function with interference between prey types: $g_i = \dots\,(1 - e^{\dots})$, with $R = \dots$ [partly illegible in source]
    References: Hofmann and Ambler, 1988

(4) Mechanistic disc function: $g_i = g_{max}\,\dots$ [remainder not legible in source]
    References: Murdoch and Oaten, 1975; Holt, 1983; Gismervik and Andersen, 1997; Strom and Loukos, 1998

(5) Michaelis-Menten function: $g_i = g_{max}\,\dfrac{p_i P_i}{K + \sum_{j=1}^{n} p_j P_j}$
    References: Murdoch, 1973; Real, 1977; Moloney and Field, 1991; Verity, 1991; Gismervik and Andersen, 1997; Strom and Loukos, 1998

(6) Threshold MM function: $g_i = g_{max}\,\dots$, with $R = \sum_{j} p_j P_j$ [partly illegible in source]
    References: Evans, 1988; Lancelot et al., 2000

(7) Modified MM function: $g_i = g_{max}\,\dfrac{p_i P_i}{1 + \sum_{j=1}^{n} p_j P_j}$
    References: Verity, 1991; Fasham et al., 1999; Tian et al., 2001

Table 2. Parameterization of grazing on multiple types of prey with active
switching selection ($g_{max}$: maximum grazing rate; $K$: half-saturation
constant; $P_0$: threshold below which grazing is zero; $p_i$: preference
coefficient; a, a, v: constants).

Function / References

(1) Switching MM predation: $g_i = g_{max}\,\dfrac{p_i P_i^2}{K \sum_{j=1}^{n} p_j P_j + \sum_{j=1}^{n} p_j P_j^2}$
    References: Fasham et al., 1990; Strom and Loukos, 1998; Pitchford and Brindley, 1999; Spitz et al., 2001

(2) Mechanistic disc switching predation: [equation not legible in source]
    References: Chesson, 1983

(3) Generalized switching function: [equation not legible in source]
    References: Tansky, 1978; Teramoto, 1979; Matsuda et al., 1986

(4) Generalized switching function: [equation not legible in source]
    References: Vance, 1978

(5) Generalized switching MM function: [equation not legible in source]
    References: Gismervik and Andersen, 1997

(6) Generalized switching MM function: [equation not legible in source; involves the thresholded prey concentration $P_j - P_{0,j}$]
    References: This work
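To make the idea of selectable grazing parameterizations concrete, the sketch below codes two of the tabulated forms as interchangeable callables, loosely analogous to choosing among C function pointers, and picks the one that best matches a set of grazing rates. The formulas are read as $g_i = g_{max}\, p_i P_i / (K + \sum_j p_j P_j)$ for the Michaelis-Menten entry and $g_i = g_{max}\, p_i P_i^2 / (K \sum_j p_j P_j + \sum_j p_j P_j^2)$ for the switching entry; that reading, the least-squares selection rule and all names and numbers are assumptions for illustration.

```python
def michaelis_menten(prey, pref, g_max, k):
    """Passive selection: g_i = g_max * p_i * P_i / (K + sum_j p_j * P_j)."""
    total = sum(p * c for p, c in zip(pref, prey))
    return [g_max * p * c / (k + total) for p, c in zip(pref, prey)]

def switching_mm(prey, pref, g_max, k):
    """Active switching: g_i = g_max * p_i * P_i**2 / (K * sum p_j P_j + sum p_j P_j**2)."""
    linear = sum(p * c for p, c in zip(pref, prey))
    quadratic = sum(p * c * c for p, c in zip(pref, prey))
    denom = k * linear + quadratic
    if denom == 0.0:
        return [0.0] * len(prey)  # No prey available: no grazing.
    return [g_max * p * c * c / denom for p, c in zip(pref, prey)]

# Registry of candidate functional forms, selectable at run time.
GRAZING = {"michaelis_menten": michaelis_menten, "switching_mm": switching_mm}

def select_parameterization(prey, pref, g_max, k, observed):
    """Return the name of the form with the smallest squared misfit."""
    def misfit(name):
        model = GRAZING[name](prey, pref, g_max, k)
        return sum((m - o) ** 2 for m, o in zip(model, observed))
    return min(GRAZING, key=misfit)

if __name__ == "__main__":
    prey = [2.0, 1.0]   # concentrations of two prey types (made-up)
    pref = [0.5, 0.5]   # preference coefficients (made-up)
    observed = [0.48, 0.13]  # made-up grazing rates
    print(select_parameterization(prey, pref, 1.0, 1.0, observed))
```

With equal preferences the switching form concentrates grazing on the more abundant prey, which is what distinguishes it from the passive form in the toy case.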

Distributed/Grid Computing, Forecasting and Data Assimilation
with Legacy Codes

•  Distributed  technologies  (Sun  Grid  Engine)  with  web  portal 
front-end  ready  to  be  tested  with  ESSE  and  HOPS 

•  Partial  parallelism  within  ESSE  easy  because  open-source 
routines  (Sun  Lapack)  were  used  from  the  start 

•  HOPS,  ESSE  and  acoustics  codes:  Fortran-matlab  legacies 

-  Relatively  complex  codes  and  makefile  options 

-  Hundreds  of  build  and  runtime  parameters 

•  For  other  (future)  codes,  source  code  might  not  be  available 

•  Classic encapsulation techniques, which compartmentalize the
code into subroutines called from wrappers, require constant
reworking

•  Thus:  we  chose  to  encapsulate  at  the  binary  level,  with  generic 
approach,  so  as  to  handle  new  codes  with  limited/no  rewriting 


Metadata  for  handling  legacy  software 

Hierarchical  structure  for  describing  code  (can  also  handle  binary-only  case) 
Basic  assumptions  about  codes  thus  encapsulated: 

•  No  independent  GUI,  all  runtime  control  from  the  command  line  and 
input/stdin  files 

•  All  build-time  parameterization  done  by  altering  the  makefile  and 
selecting  values  (parameters)  in  include-files 

Datatypes  and  relevant  ranges  for  each  parameter  checked  to  ensure  validity 
XML  Encapsulation  for  Legacy  Binaries

Descriptions  of  I/O  files,  runtime  parameters,  stdin  and  command  line 
arguments,  makefile  parameters,  requirements  and  conflicts  for  options, 
invocation  mechanisms  are  needed: 

-  Essentially  a  computer  readable  install  and  user  guide 

-  XML  description  provides  software  and  build  metadata 

-  Design  of  appropriate  hierarchical  XML  Schemas  (evolutionary) 

-  Simulation  datafile  metadata  are  also  usable  (e.g.  NcML  for  NetCDF) 

-  Provides  the  constraints  for  generation  of  workflows  (file  I/O  based) 

Binaries  can  be  built  on  demand  from  generated  makefiles 

Developers  need  to  keep  XML  description  up-to-date  with  their  code 
(incremental  effort)  without  switching  to  more  elaborate  approaches 

Concept  is  generally  applicable,  directly  useful  with  other  ocean  models 
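A minimal sketch of how such a description might be consumed: a small XML document lists a binary's runtime parameters with datatypes and ranges, and a helper validates the user's choices before assembling the command line. The element and attribute names, the executable `./pe_model` and its flags are invented for this example and do not reproduce the actual LOOPS/Poseidon schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical description of a legacy binary (names are illustrative only).
DESCRIPTION = """
<binary name="pe_model" executable="./pe_model">
  <parameter name="nsteps" type="int" min="1" max="100000" flag="-n"/>
  <parameter name="dt" type="float" min="0.0" max="3600.0" flag="-dt"/>
  <input name="initial" format="NetCDF"/>
  <output name="forecast" format="NetCDF"/>
</binary>
"""

CASTS = {"int": int, "float": float}

def build_command(xml_text, choices):
    """Validate choices against the description and return the argument list.

    Raises ValueError for a missing parameter, a value of the wrong
    datatype, or a value outside the declared range.
    """
    root = ET.fromstring(xml_text.strip())
    command = [root.get("executable")]
    for param in root.findall("parameter"):
        name = param.get("name")
        if name not in choices:
            raise ValueError(f"missing parameter: {name}")
        cast = CASTS[param.get("type")]
        # Casting from text rejects, for example, "3.5" for an int parameter.
        value = cast(str(choices[name]))
        low, high = cast(param.get("min")), cast(param.get("max"))
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside [{low}, {high}]")
        command += [param.get("flag"), str(value)]
    return command

if __name__ == "__main__":
    print(build_command(DESCRIPTION, {"nsteps": 10, "dt": 300}))
```

Keeping the checks data-driven means that supporting a new binary requires only a new description file rather than new wrapper code.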


Java-Based  GUI  for  Legacy  Binaries 

Prototype  GUI,  accepts  generic  set  of  description  files  and 
generates  user  interface  for  building  and  running  the  binary. 
Implemented  as  an  applet. 

Validates  user  choices,  generates  relevant  scripts 

Integral  part  of  the  Grid-portal  for  LOOPS/Poseidon,  it  can 
be  re-implemented  in  a  more  server-centric  way  (JSP  etc.) 

Future  directions  for  enhancement  include: 

-  Workflow  composition:  Employing  the  descriptions  of  the 
binaries  and  their  input/output  files  as  constraints.  We  are 
currently  using  predefined  workflows. 

-  Context  mediation:  When  dataflow  endpoints  mismatch 


GUI:  validity  checking 


Interactive  Visualization  and  Targeting  of  pdfs
Advanced  Visualization  and  Interactive  Systems  Lab:  A.  Love,  W.  Shen,  A.  Pang
Interactive  Visualization  and  Targeting  of  pdfs  (cont.)
Search  2D  CP  Candidates
CONCLUSIONS:  Present  and  Future


Advanced  systems  for  adaptive  sampling  and  adaptive  modeling  in 
a  distributed  computing  environment 

Web interface, Remote visualization, Metadata for code and data,
XML-based encapsulation of software, Grid computing
infrastructure (Sun Grid Engine)

Interdisciplinary  data  assimilation  should  contribute  significantly 
to  understanding,  especially  to  the  quantitative  development  of 
fundamental/simplified  coupled  models

More interdisciplinary research and education needed:
mathematics, computer science, physical-biogeochemical-acoustical
ocean science, atmospheric science, earth science and
complex system science

Short-term  impacts  likely  overestimated,  long-term  effects  likely 
under-estimated 


Feature Extraction for Adaptive Sampling

•  Developing automated procedures to identify physical
features of interest in the flow: upwelling, eddies & gyres,
jets/fronts etc.

•  Procedure can be based on a threshold for a derived quantity
or a more complicated set of rules.

•  Graphical output (in conjunction with uncertainty information)
helps the user plan sampling patterns and vehicle paths.
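As an example of a threshold on a derived quantity, the sketch below computes relative vorticity from gridded velocities by centered differences and flags grid points whose magnitude exceeds a threshold, a crude indicator of eddies. The choice of vorticity, the grid layout and the threshold value are assumptions; the slides do not specify which quantities or rules are used.

```python
def relative_vorticity(u, v, dx, dy):
    """Return zeta = dv/dx - du/dy at interior points (boundaries left at 0).

    u, v: 2D lists indexed [j][i], with j along y and i along x.
    dx, dy: grid spacings in x and y.
    """
    ny, nx = len(u), len(u[0])
    zeta = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dvdx = (v[j][i + 1] - v[j][i - 1]) / (2.0 * dx)
            dudy = (u[j + 1][i] - u[j - 1][i]) / (2.0 * dy)
            zeta[j][i] = dvdx - dudy
    return zeta

def flag_features(field, threshold):
    """Return the (j, i) grid points where |field| exceeds the threshold."""
    return [(j, i)
            for j, row in enumerate(field)
            for i, value in enumerate(row)
            if abs(value) > threshold]

if __name__ == "__main__":
    # Toy flow (assumption): solid-body rotation, u = -omega*y, v = omega*x,
    # whose vorticity is 2*omega everywhere.
    omega, n = 0.5, 5
    u = [[-omega * j for i in range(n)] for j in range(n)]
    v = [[omega * i for i in range(n)] for j in range(n)]
    zeta = relative_vorticity(u, v, dx=1.0, dy=1.0)
    print(flag_features(zeta, threshold=0.5))
```

The flagged points could then be overlaid on the uncertainty maps to help plan sampling patterns and vehicle paths.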


